CJK Experiments with Hummingbird SearchServer™ at NTCIR-5

Author

  • Stephen Tomlinson
Abstract

Hummingbird submitted ranked result sets for the Chinese, Japanese and Korean Single Language Information Retrieval subtasks of the Cross-Lingual Information Retrieval Task of the 5th NII-NACSIS Test Collection for IR Systems Workshop (NTCIR-5). For short Chinese (title) queries, a decompounded word-based approach produced statistically significantly higher mean average precision and mean first relevant scores than an overlapping n-gram approach. For Korean queries, a word-based decompounding and stemming approach produced significantly higher mean average precision scores than plain word-based matching. For Japanese title queries, a blind feedback technique that produced a statistically significant increase in mean average precision also produced a statistically significant decrease in mean first relevant score.

Similar resources

NTCIR-10 CrossLink-2 Task: A Link Mining Strategy

At NTCIR-10 we participated in the cross-lingual link discovery (CrossLink-2) task. In this paper we describe our systems for discovering cross-lingual links between the Chinese, Japanese, and Korean (CJK) Wikipedia and the English Wikipedia. The evaluation results show that our implementation of the cross-lingual linking method achieved promising results.

Using Wikipedia to Translate OOV Term on MLIR

We deal with Chinese, Japanese and Korean multilingual information retrieval (MLIR) in NTCIR-6, and submit our results on the C-CJK-T and C-CJK-D subtasks. In these runs, we adopt a dictionary-based approach to translate query terms. In addition to a traditional dictionary, we incorporate Wikipedia as a live dictionary.

Experiments in Named Page Finding and Arabic Retrieval with Hummingbird SearchServer™ at TREC 2002

Hummingbird participated in the named page finding task of the TREC 2002 Web Track (find the named page in 18GB from the .GOV domain) and the monolingual Arabic topic relevance task of the TREC 2002 Cross-Language Track (find all relevant documents in 869MB of Arabic news data). In the named page finding task, SearchServer returned the named page in the first 10 rows for more than 80% of the 15...

UKP at CrossLink2: CJK-to-English Subtasks

This paper describes UKP’s participation in the cross-lingual link discovery task at NTCIR-10 (CrossLink2). The task addressed in our work is to find valid anchor texts from a Chinese, Japanese, and Korean (CJK) Wikipedia page and retrieve the corresponding target Wiki pages in the English language. The CrossLink framework was developed based on our previous CrossLink system that works on the o...

Automated Cross-lingual Link Discovery in Wikipedia

At NTCIR-9, we participated in the cross-lingual link discovery (Crosslink) task. In this paper we describe our approaches to discovering Chinese, Japanese, and Korean (CJK) cross-lingual links for English documents in Wikipedia. Our experimental results show that a link mining approach that mines the existing link structure for anchor probabilities and relies on the “translation” using cross-l...

Publication date: 2004